Reducing the quantity of annotations required for supervised training is vital when labels are scarce and costly. This reduction is especially important for semantic segmentation tasks involving 3D datasets that are often significantly smaller and more challenging to annotate than their image-based counterparts. Self-supervised pre-training on large unlabelled datasets is one way to reduce the amount of manual annotations needed. Previous work has focused on pre-training with point cloud data exclusively; this approach often requires two or more registered views. In the present work, we combine image and point cloud modalities, by first learning self-supervised image features and then using these features to train a 3D model. By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene. We demonstrate that our pre-training approach, despite using single scans, achieves comparable performance to other multi-scan, point cloud-only methods.
translated by 谷歌翻译
无缝人体机器人互动(HRI)和合作人员(HR)批判性地依靠准确和及时的人类心理工作量(MW)模型。认知负载理论(CLT)表明代表性的物理环境产生代表性的心理过程;物理环境保护程度对应于改进的建模精度。虚拟现实(VR)系统提供能够复制复杂情景的沉重环境,特别是那些与高风险高应力场景相关的复杂情景。被动生物数据建模显示了承诺作为MW建模的非侵入性方法。然而,VR系统很少包括多模式心理生理反馈或大写在线MW建模的生物功能数据。在这里,我们开发了一种新的VR仿真管线,受到NASA多属性任务电池II(MATB-II)任务架构的启发,能够在模拟危险勘探环境中同步地收集客观性能,主观性能和被动人体生物的。我们的系统设计提取并通过机器人操作系统(ROS)提取生物斑点,促进基于心理生理学的MW模型集成到完整的端到端系统中。能够在线评估MWS的VR模拟管道可以通过使这些系统能够以响应于操作者MW自适应地改变其行为来推进人力资源系统和VR经验。
translated by 谷歌翻译
Motion(SFM)的结构最近被称为一个自我监督的学习问题,在该问题中,深度和自我的神经网络模型通过视图合成共同学习。本文中,我们解决了如何最好的夫妇或链接,深度和自我网络组件的开放问题,以便可以在网络之间共享诸如共同规模因子之类的信息。为此,我们介绍了几个耦合的概念,对现有方法进行了分类,并提出了一种新颖的紧密耦合方法,该方法在训练时和测试时利用了深度和自我的相互依存关系。我们的方法使用迭代视图合成来递归更新eGomotion网络输入,从而允许在组件之间传递上下文信息。我们通过实质性实验证明,我们的方法在测试时间促进了深度和自我预测之间的一致性,改善了概括,并导致室内和室外深度和室外深度和自我评估基准的最新准确性。
translated by 谷歌翻译
成功的视觉导航取决于捕获包含足够有用信息的图像。在这封信中,我们探索了一种数据驱动的方法来说明环境照明的变化,改善了在视觉探测器(VO)或视觉同时定位和映射(SLAM)中使用的图像质量。我们训练深层卷积神经网络模型,以预测地调整相机增益和曝光时间参数,以便连续图像包含最大数量的可匹配功能。训练过程是完全自我监督的:我们的训练信号来自基础VO或SLAM管道,因此,对模型进行了优化,可以通过该特定管道进行良好的操作。我们通过广泛的现实世界实验证明,我们的网络可以预期并补偿急剧的照明变化(例如,过渡到道路隧道的过渡),比竞争摄像机参数控制算法保持了更高数量的Inlier功能匹配。
translated by 谷歌翻译
在许多领域,包括强化学习和控制在内的许多领域,从一系列高维观测中学习或识别动力学是一个困难的挑战。最近通过潜在动力学从生成的角度研究了这个问题:将高维观测结果嵌入到较低维的空间中,可以在其中学习动力学。尽管取得了一些成功,但尚未将潜在动力学模型应用于现实世界的机器人系统,在这些机器人系统中,学习的表示形式必须适合各种感知混杂和噪声源。在本文中,我们提出了一种共同学习潜在状态表示的方法以及在感知困难条件下的长期计划和闭环控制的相关动力。作为我们的主要贡献,我们描述了我们的表示如何能够通过检测新颖或分布(OOD)输入来捕获测试时间的异质或输入特异性不确定性的概念。我们介绍了有关两个基于图像的任务的预测和控制实验的结果:一个模拟的摆平衡任务和实现任务的现实世界机器人操纵器。我们证明,与仅在不同程度的输入降解的情况下,我们的模型可产生更准确的预测,并表现出改善的控制性能。
translated by 谷歌翻译
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
translated by 谷歌翻译
Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naive approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL sufficiently matches the expert distribution without fully learning the desired task. This can be particularly catastrophic for manipulation tasks, where the difference between an expert and a non-expert state-action pair is often subtle. We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple exploratory, auxiliary tasks in addition to a main task. The addition of these auxiliary tasks forces the agent to explore states and actions that standard AIL may learn to ignore. Additionally, this particular formulation allows for the reusability of expert data between main tasks. Our experimental results in a challenging multitask robotic manipulation domain indicate that LfGP significantly outperforms both AIL and behaviour cloning, while also being more expert sample efficient than these baselines. To explain this performance gap, we provide further analysis of a toy problem that highlights the coupling between a local maximum and poor exploration, and also visualize the differences between the learned models from AIL and LfGP.
translated by 谷歌翻译
Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
translated by 谷歌翻译
The Covid-19 pandemic induced a vast increase in adolescents diagnosed with eating disorders and hospitalized due to eating disorders. This immense growth stemmed partially from the stress of the pandemic but also from increased exposure to content that promotes eating disorders via social media, which, within the last decade, has become plagued by pro-eating disorder content. This study aimed to create a deep learning model capable of determining whether a given social media post promotes eating disorders based solely on image data. Tweets from hashtags that have been documented to promote eating disorders along with tweets from unrelated hashtags were collected. After prepossessing, these images were labeled as either pro-eating disorder or not based on which Twitter hashtag they were scraped from. Several deep-learning models were trained on the scraped dataset and were evaluated based on their accuracy, F1 score, precision, and recall. Ultimately, the vision transformer model was determined to be the most accurate, attaining an F1 score of 0.877 and an accuracy of 86.7% on the test set. The model, which was applied to unlabeled Twitter image data scraped from "#selfie", uncovered seasonal fluctuations in the relative abundance of pro-eating disorder content, which reached its peak in the summertime. These fluctuations correspond not only to the seasons, but also to stressors, such as the Covid-19 pandemic. Moreover, the Twitter image data indicated that the relative amount of pro-eating disorder content has been steadily rising over the last five years and is likely to continue increasing in the future.
translated by 谷歌翻译
We introduce a pivot for exact selective inference with randomization. Not only does our pivot lead to exact inference in Gaussian regression models, but it is also available in closed form. We reduce the problem of exact selective inference to a bivariate truncated Gaussian distribution. By doing so, we give up some power that is achieved with approximate inference in Panigrahi and Taylor (2022). Yet we always produce narrower confidence intervals than a closely related data-splitting procedure. For popular instances of Gaussian regression, this price -- in terms of power -- in exchange for exact selective inference is demonstrated in simulated experiments and in an HIV drug resistance analysis.
translated by 谷歌翻译